Recommendation System Based on Social Relations
Short answer: Recommender systems are widely used today and in simple terms, their purpose is to suggest items or products to those who might be interested in them. Recommender systems are now essential because of information over load and interaction overload .Information overload refers to the scenario where there is too much availability of data so that it becomes difficult to make a decision regarding a product or an item And there is the new phenomena of interaction overload .Loads of information regarding a single subject is available through many sources .These information sources may be blogs, micro blogs .wikis etc. .Hence recommendation systems are used to recommend items to a user ,which match his interest ,filtering out details that the recommender system feels irrelevant . Social media refers to web 2.0 sites where can share content and also interact with each other .Social media is characterized by user centered design (personalized according to user’s profile ) ,user generated content (tags ,comments ,blogs) and social networking sites and online communities such as face book ,twitter and orkut .in social networking sites like face book ,a user can make friends with others ,tag them and comment on photos or videos uploaded by them. It also provides platform to form online communities with others who share similar interest in a particular subject .Thus a user profile in a Social networking sites can be used to make relevant personalized recommendation for that user .That is integrating the information available about a user in a SNS with traditional recommender systems improves the accuracy of recommending various items .The information available in the social networking site can be used to leverage the information and recommender systems can be used to filter that information to arrive at the recommendations for a user . Long Answer: Traditional recommender systems include content based recommendation system and collaborative recommendation system .In content based filtering an item is recommended to a user, if it is similar to other items rated by the user .The similarities between the items is calculated taking into account various attributes of the item .Basically in content based recommender system tries to create an user profile based on the items the user has rated in the past. Then this profile is used to recommend new items to the users that match his profile. Content based recommender systems suffer from a cold start problem generally known as the “New user problem “.CB requires the user to rate sufficient number of items in order to create a user based profile. Also calculating the similarities between the items to be recommended and which are already rated by the user is quite difficult as it requires comparison of significant number of attributes of the items .And it requires more of users effort ,rating sufficient items ,in order to be recommended an item . The other recommender system that is widely used is the collaborative filtering. In collaborative filtering ,recommendation for an object for a particular user is based on the ratings given to that object by similar users (i.e.),collaborative filtering generates recommendations for an object based on inter user similarity .In much narrower sense ,if a person A has the same opinion as person B on an object X, then A is likely to have same opinion as B on another object Y ,rather than the opinion of a randomly selected person .Netflix uses collaborative filtering to recommend movies to it’s users . The first step in collaborative filtering based recommender systems is to build a neighborhood of like- minded customers who appear to have similar preferences .The next step is to use the ratings from those like-minded users found in the previous step to calculate a prediction for the active user. CF Recommender systems also suffer from cold start problem .That is the collaborative filtering might not work, if there are no substantial users who rated an object. Also new users will need to rate sufficient number of items to enable the system to capture their preferences accurately and thus provides reliable recommendations. As fore-mentioned CF takes into account user similarity in predicting ratings for a user .the user similarity can be calculated either by cosine similarity or by Pearson similarity. The problem with CF is that user similarity matrix for a large data base of users is sparse. In fact, the process of comparing two users in order to compute the similarity involves comparing the ratings they provided for items. And it becomes necessary that the two users rated at least some of the same items. However in a typical domain, for example in the domain of movies or books, the number of items is very large (in the order of the millions) while the number of items rated by every single user is in general small (in the order of dozens or less). This means that it is very unlikely two random users have rated any items in common and hence they are not comparable. Another limitation of CF is that RS can easily be attacked by creating ad hoc user profiles with the goal of being considered as similar to the target user and influence the recommendations (copy – profile attacks ) . The solution is to enhance CF with Trust derived from Social Networks like twitter, Facebook or epinions.com.With the development of web 2.0 and various social networking sites a user’s network of friends and people of interest becomes more accessible.(Facebook ,LinkedIn, Twitter ,…). To improve the recommendation accuracy, user’s social network should be taken into consideration .Such social relationships can be very effective for recommendation compared to the traditional Collaborative filtering. People who are socially connected are more likely to share the same or similar interests. Users are easily influenced by the friends they trust and often prefer their friends’ recommendations rather than the ratings given by some anonymous user though there exits certain similarity between them. Merging social networks based trust and recommender systems can improve the accuracy of recommendations and improves the user’ experience. The questions that need to be answered when taking trust metric into consideration are whom to trust, how much a friend in a social network can be trusted and what information to trust . Deriving Trust: Trust can be derived either explicitly or implicitly .As the name suggests Explicit trust denotes the trust values indicated explicitly by the user. The limitations of deriving trust explicitly is that users on an average have very few links (trusted sources).Hence a particular user can explicitly rate his friends and not those who are his friends of friends .Also it involves more user’s effort to rate every one of friends . One good example of this explicit trust, is the Epinions.com (a general consumer review site).Here users can create a profile, define explicit trust on other authors and also create a web of trust of users, whose ratings of items they are likely to take into consideration .Also users can block those authors whose reviews they felt offensive, inaccurate and not valuable. The other way to is to derive trust implicitly. Implicit trust is the trust value inferred from some evidence such as feature of similarity between the users, interactions between the users in the social networks etc. For example in Facebook, the amount of trust between two users can be determined by the features such as the number of mutual friends between the users, interactions between them and profile similarity .In Twitter, the trust information between two users can be determined by the factors such as the number of re-tweets and mentions of the user in a tweet .Now let’s see a model proposed for calculating the implicit trust in the social networking site twitter. Twitter users interact in three ways: (a) following other user tweets (b) re-tweeting other user’s tweets © mentioning another user or (d) favoring another users’ tweet. The action of a user X following user Y means that X wants to receive all the tweets published by b. A user can start following someone for many reasons. The user might have been recommended by a friend or by Twitter itself. Hence much information regarding trust cannot be gathered from following. The other action is re tweeting a tweet .A user X re tweeting a tweet from user Y implies that X found the content of tweet interesting .Hence it is reasonable to think that that the more X re tweets Y ,the more X trusts Y. Similarly mentions can be used to infer trust. Then the trust between two users can be implicitly calculated as t(i,j) = wNm(i,j)+(1-w)Nr(i,j) N(i) where N(i) is the number of interactions from i to j and w is the weight assigned to retweets against mentions.Nm is the number of mentions and Nr is the number of re tweets . Trust Inference: Trust can be calculated either globally or locally. Global trust metrics compute a single global trust value every single user (reputation) in the network .The pros the global trust are that it takes into account the wisdom of crowds to determine trust and is simple to compute and the drawback is as trust is subjective in nature global trust may not be taken as a clear indicator of trust. PageRank qualifies as a Global trust metric as it computes the importance of every single web page based on the number of incoming edges and their respective importance weights. This can be used to determine the trust of every node in a trust network. Local Trust metrics predict (different) trust scores that are personalized from the point of view of''' every single user. The pros the local trust metrics are that it is more accurate than global trust because trust for every node in the trust network is calculated with respect to a single active user and hence is attack resistant. The cons are that it ignores the wisdom of crowds and is also complex to calculate. Trust Propagation Trust propagation is necessary in order to calculate the trust between an source (an active user ) and a sink ,when there is no direct trust between them .In order to propagate trust find all trust paths from source to target and propagate trust along trust paths. The rules that hold while propagating trust are 1. There is trust decay with every hop 2. An user cannot propagate more than the received trust 3. Distrust blocks the propagation. The two kinds of local trust propagation algorithms are Tidal trust and Mole trust. Tidal trust between a source and sink node is calculated according to the following rules. 1.If the source node does not know the sink, the source asks all of its friends how much to trust the sink, and computes a trust value by weighted average.2.Neighbors repeat the process if they do not have a direct rating for the sink .Tidal trust is a modified breadth first search algorithm Here max value is chosen such that there is at least one path between the source and the sink .Here max is the minimum value of trust between two nodes for the trust to propagate . Example: Consider the trust network Here consider Alice as the active user (source) and Eve as the sink whose trust w.r.t to Alice is to be calculated. Set the trust threshold to 0.5 and make trust graph acyclic. The predicted trust score of a user is the average of all kept incoming trust edge values, weighted by the trust score of the user who has issued the trust statement. The predicted trust score of Eve is (0.8*0.6+1.0*0.9)/ (0.8+1.0) = 0.767 Mole trust is similar to tidal trust but with a tunable trust propagation horizon .There is linear decay based on the distance from the source .Maximum propagation distance (mpd) is an important parameter beyond which there is no trust propagation. Trust derived for a node with mpd =1, 2, 3, 4 are called Trust-1, Trust-2, Trust -3 and Trust- 4 respectively. Prediction of rating based on trust: Trust based weighted mean: Here the trust value the trust value present between users u ''and ''v ''as a weight for the ratings of other users in place of the similarity measure. This is the technique used by the Tidal Trust algorithm. Trust based collaborative filtering: In this method, the similarity weight attributed to ratings by user ''v ''for user ''u, calculated using PCC is replaced with the value of '' trust t.This technique is used in Mole trust algoriithm.Trust based collaborative filtering: In this method, the similarity weight attributed to ratings by user ''v ''for user ''u, calculated using PCC is replaced with the value of '' ''trust.This technique is used in Mole trust . Experiment conducted on epinions .com reveals how trust solves traditional Rs problems .Here users can review and rate items ,create a web of trust .Data set for the experiment is ~ 50 k users ,~140 k items ,~660 k ratings ,~ 500k trust statements . Mean number of comparable users with different methods: Trust and Pearson correlation 'coefficient. For trust, we indicate the mean number of users reach able through some trust chain in at most ''x steps. For Pearson, we indicate the mean number of users against which Pearson coefficient is computable (i.e. overlap of at least 2 items). Both are computed over every user (even the ones with 0 ratings or 0 friends). How does trust solve RS‘s problems? In Collaborative filtering user similarity is often not computable Using trust propagation and “ six degrees “,trust can be predicted for many users .With few trust steps it is possible to reach every person in the world. For cold start users, add trust path to just one friend and that trust can be propagated along the network. It is resistant against copy - profile attack .It is transparent where as traditional Rs’ are mainly conceived and perceived as black boxes